GAWK(1) Utility Commands GAWK(1)

NAME
gawk - pattern scanning and processing language

SYNOPSIS
gawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
gawk [ POSIX or GNU style options ] [ -- ] program-text file ...
pgawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
pgawk [ POSIX or GNU style options ] [ -- ] program-text file ...

DESCRIPTION
Gawk is the GNU Project's implementation of the AWK programming
language. It conforms to the definition of the language in the POSIX
1003.2 Command Language And Utilities Standard. This version in turn
is based on the description in The AWK Programming Language, by Aho,
Kernighan, and Weinberger, with the additional features found in the
System V Release 4 version of UNIX awk. Gawk also provides more recent
Bell Laboratories awk extensions, and a number of GNU-specific
extensions.

Pgawk is the profiling version of gawk. It is identical in every way
to gawk, except that programs run more slowly, and it automatically
produces an execution profile in the file awkprof.out when done. See
the --profile option, below.

The command line consists of options to gawk itself, the AWK program
text (if not supplied via the -f or --file options), and values to be
made available in the ARGC and ARGV pre-defined AWK variables.

OPTION FORMAT
Gawk options may be either traditional POSIX one letter options, or
GNU style long options. POSIX options start with a single "-", while
long options start with "--". Long options are provided for both
GNU-specific features and for POSIX-mandated features.

Following the POSIX standard, gawk-specific options are supplied via
arguments to the -W option. Multiple -W options may be supplied. Each
-W option has a corresponding long option, as detailed below.
Arguments to long options are either joined with the option by an =
sign, with no intervening spaces, or they may be provided in the next
command line argument. Long options may be abbreviated, as long as the
abbreviation remains unique.

OPTIONS
Gawk accepts the following options, listed alphabetically.

-F fs
--field-separator fs
    Use fs for the input field separator (the value of the FS
    predefined variable).

-v var=val
--assign var=val
    Assign the value val to the variable var, before execution of the
    program begins. Such variable values are available to the BEGIN
    block of an AWK program.

-f program-file
--file program-file
    Read the AWK program source from the file program-file, instead of
    from the first command line argument. Multiple -f (or --file)
    options may be used.

-mf NNN
-mr NNN
    Set various memory limits to the value NNN. The f flag sets the
    maximum number of fields, and the r flag sets the maximum record
    size. These two flags and the -m option are from the Bell
    Laboratories research version of UNIX awk. They are ignored by
    gawk, since gawk has no pre-defined limits.

-W compat
-W traditional
--compat
--traditional
    Run in compatibility mode. In compatibility mode, gawk behaves
    identically to UNIX awk; none of the GNU-specific extensions are
    recognized. The use of --traditional is preferred over the other
    forms of this option. See GNU EXTENSIONS, below, for more
    information.

-W copyleft
-W copyright
--copyleft
--copyright
    Print the short version of the GNU copyright information message
    on the standard output and exit successfully.

-W dump-variables[=file]
--dump-variables[=file]
    Print a sorted list of global variables, their types and final
    values to file. If no file is provided, gawk uses a file named
    awkvars.out in the current directory.

    Having a list of all the global variables is a good way to look
    for typographical errors in your programs. You would also use this
    option if you have a large program with a lot of functions, and
    you want to be sure that your functions don't inadvertently use
    global variables that you meant to be local. (This is a
    particularly easy mistake to make with simple variable names like
    i, j, and so on.)

-W help
-W usage
--help
--usage
    Print a relatively short summary of the available options on the
    standard output. (Per the GNU Coding Standards, these options
    cause an immediate, successful exit.)

-W lint[=fatal]
--lint[=fatal]
    Provide warnings about constructs that are dubious or non-portable
    to other AWK implementations. With an optional argument of fatal,
    lint warnings become fatal errors. This may be drastic, but its
    use will certainly encourage the development of cleaner AWK
    programs.

-W lint-old
--lint-old
    Provide warnings about constructs that are not portable to the
    original version of Unix awk.

-W gen-po
--gen-po
    Scan and parse the AWK program, and generate a GNU .po format file
    on standard output with entries for all localizable strings in the
    program. The program itself is not executed. See the GNU gettext
    distribution for more information on .po files.

-W non-decimal-data
--non-decimal-data
    Recognize octal and hexadecimal values in input data. Use this
    option with great caution!

-W posix
--posix
    This turns on compatibility mode, with the following additional
    restrictions:

    o  \x escape sequences are not recognized.
    o  Only space and tab act as field separators when FS is set to a
       single space, newline does not.
    o  You cannot continue lines after ? and :.
    o  The synonym func for the keyword function is not recognized.
    o  The operators ** and **= cannot be used in place of ^ and ^=.
    o  The fflush() function is not available.

-W profile[=prof_file]
--profile[=prof_file]
    Send profiling data to prof_file. The default is awkprof.out. When
    run with gawk, the profile is just a "pretty printed" version of
    the program. When run with pgawk, the profile contains execution
    counts of each statement in the program in the left margin and
    function call counts for each user-defined function.

-W re-interval
--re-interval
    Enable the use of interval expressions in regular expression
    matching (see Regular Expressions, below). Interval expressions
    were not traditionally available in the AWK language. The POSIX
    standard added them, to make awk and egrep consistent with each
    other. However, their use is likely to break old AWK programs, so
    gawk only provides them if they are requested with this option, or
    when --posix is specified.

-W source program-text
--source program-text
    Use program-text as AWK program source code. This option allows
    the easy intermixing of library functions (used via the -f and
    --file options) with source code entered on the command line. It
    is intended primarily for medium to large AWK programs used in
    shell scripts.

-W version
--version
    Print version information for this particular copy of gawk on the
    standard output. This is useful mainly for knowing if the current
    copy of gawk on your system is up to date with respect to whatever
    the Free Software Foundation is distributing. This is also useful
    when reporting bugs. (Per the GNU Coding Standards, these options
    cause an immediate, successful exit.)

--  Signal the end of options. This is useful to allow further
    arguments to the AWK program itself to start with a "-". This is
    mainly for consistency with the argument parsing convention used
    by most other POSIX programs.

In compatibility mode, any other options are flagged as invalid, but
are otherwise ignored. In normal operation, as long as program text
has been supplied, unknown options are passed on to the AWK program in
the ARGV array for processing. This is particularly useful for running
AWK programs via the "#!" executable interpreter mechanism.
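
As a rough illustration of the -F and -v options described above, the
sketch below sets the field separator to a colon and assigns a
variable before the program starts. The function name opts_demo, the
variable name label and the sample input are made up for this example,
and the command is spelled awk on the assumption that it resolves to
gawk or another POSIX awk.

```shell
# Hypothetical demo: -F sets FS, -v assigns a variable before BEGIN runs.
opts_demo() {
    printf 'root:x:0\n' |
        awk -F: -v label=user '
            BEGIN { print "label is " label }   # -v value is visible here
                  { print label, $1 }           # $1 is split on ":"
        '
}
opts_demo
```

Because the assignment is made with -v, the value is already set when
the BEGIN block runs.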

AWK PROGRAM EXECUTION
An AWK program consists of a sequence of pattern-action statements and
optional function definitions.

    pattern { action statements }
    function name(parameter list) { statements }

Gawk first reads the program source from the program-file(s) if
specified, from arguments to --source, or from the first non-option
argument on the command line. The -f and --source options may be used
multiple times on the command line. Gawk reads the program text as if
all the program-files and command line source texts had been
concatenated together. This is useful for building libraries of AWK
functions, without having to include them in each new AWK program that
uses them. It also provides the ability to mix library functions with
command line programs.
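
A minimal sketch of the library idea, assuming two hypothetical files
lib.awk and main.awk created in a temporary directory: the function is
defined in one file and used in the other, and awk treats the two -f
files as one concatenated program. The names multi_f_demo and double
are invented for the example.

```shell
# Hypothetical demo: two -f files are read as a single program.
multi_f_demo() {
    dir=$(mktemp -d) || return 1
    # Library file: defines a function only.
    printf 'function double(n) { return 2 * n }\n' > "$dir/lib.awk"
    # Main program: calls the library function on the first field.
    printf '{ print double($1) }\n' > "$dir/main.awk"
    printf '21\n' | awk -f "$dir/lib.awk" -f "$dir/main.awk"
    rm -rf "$dir"
}
multi_f_demo
```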

The environment variable AWKPATH specifies a search path to use when
finding source files named with the -f option. If this variable does
not exist, the default path is ".:/usr/local/share/awk". (The actual
directory may vary, depending upon how gawk was built and installed.)
If a file name given to the -f option contains a "/" character, no
path search is performed.

Gawk executes AWK programs in the following order. First, all variable
assignments specified via the -v option are performed. Next, gawk
compiles the program into an internal form. Then, gawk executes the
code in the BEGIN block(s) (if any), and then proceeds to read each
file named in the ARGV array. If there are no files named on the
command line, gawk reads the standard input.

If a filename on the command line has the form var=val it is treated
as a variable assignment. The variable var will be assigned the value
val. (This happens after any BEGIN block(s) have been run.) Command
line variable assignment is most useful for dynamically assigning
values to the variables AWK uses to control how input is broken into
fields and records. It is also useful for controlling state if
multiple passes are needed over a single data file.
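
The timing difference between -v and a var=val argument can be seen in
a small sketch like the following. The names assign_demo and x are
made up, and the trailing "-" is used here to name the standard input
explicitly.

```shell
# Hypothetical demo: a var=val operand is assigned only after BEGIN.
assign_demo() {
    printf 'a\n' |
        awk '
            BEGIN { print "BEGIN sees [" x "]" }    # x is still unset
                  { print "record sees [" x "]" }   # x has been assigned
        ' x=1 -
}
assign_demo
```

The BEGIN line should show an empty value, while the rule run for the
input record should show 1.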

If the value of a particular element of ARGV is empty (""), gawk skips
over it.

For each record in the input, gawk tests to see if it matches any
pattern in the AWK program. For each pattern that the record matches,
the associated action is executed. The patterns are tested in the
order they occur in the program.

Finally, after all the input is exhausted, gawk executes the code in
the END block(s) (if any).
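
The execution order described above can be sketched with a
two-record input and two overlapping patterns. The function name
order_demo and the sample data are invented for the illustration.

```shell
# Hypothetical demo: BEGIN first, rules in program order, END last.
order_demo() {
    printf 'a\nb\n' |
        awk '
            BEGIN  { print "start" }
            /a/    { print "rule 1: " $0 }   # matches only "a"
            /a|b/  { print "rule 2: " $0 }   # matches both records
            END    { print NR " records" }
        '
}
order_demo
```

The record "a" matches both patterns, so both actions run for it, in
the order in which the rules appear.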

VARIABLES, RECORDS AND FIELDS
AWK variables are dynamic; they come into existence when they are
first used. Their values are either floating-point numbers or strings,
or both, depending upon how they are used. AWK also has one
dimensional arrays; arrays with multiple dimensions may be simulated.
Several pre-defined variables are set as a program runs; these will be
described as needed and summarized below.

Records
Normally, records are separated by newline characters. You can control
how records are separated by assigning values to the built-in variable
RS. If RS is any single character, that character separates records.
Otherwise, RS is a regular expression. Text in the input that matches
this regular expression separates records. However, in compatibility
mode, only the first character of its string value is used for
separating records. If RS is set to the null string, then records are
separated by blank lines. When RS is set to the null string, the
newline character always acts as a field separator, in addition to
whatever value FS may have.
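
A small sketch of the null-string case, assuming a made-up input with
two blank-line-separated records; the function name para_demo is
hypothetical. Note how the newline inside the first record counts as a
field separator.

```shell
# Hypothetical demo: RS = "" makes blank lines separate records.
para_demo() {
    printf 'a b\nc\n\nd\n' |
        awk 'BEGIN { RS = "" } { print NR ": " NF " fields" }'
}
para_demo
```

The first record spans two lines and has three fields (a, b and c);
the second has one.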

Fields
As each input record is read, gawk splits the record into fields,
using the value of the FS variable as the field separator. If FS is a
single character, fields are separated by that character. If FS is the
null string, then each individual character becomes a separate field.
Otherwise, FS is expected to be a full regular expression. In the
special case that FS is a single space, fields are separated by runs
of spaces and/or tabs and/or newlines. (But see the discussion of
--posix, above.) NOTE: The value of IGNORECASE (see below) also
affects how fields are split when FS is a regular expression, and how
records are separated when RS is a regular expression.

If the FIELDWIDTHS variable is set to a space separated list of
numbers, each field is expected to have fixed width, and gawk splits
up the record using the specified widths. The value of FS is ignored.
Assigning a new value to FS overrides the use of FIELDWIDTHS, and
restores the default behavior.

Each field in the input record may be referenced by its position, $1,
$2, and so on. $0 is the whole record. Fields need not be referenced
by constants:

    n = 5
    print $n

prints the fifth field in the input record.

The variable NF is set to the total number of fields in the input
record.

References to non-existent fields (i.e. fields after $NF) produce the
null string. However, assigning to a non-existent field (e.g.,
$(NF+2) = 5) increases the value of NF, creates any intervening fields
with the null string as their value, and causes the value of $0 to be
recomputed, with the fields being separated by the value of OFS.
References to negative numbered fields cause a fatal error.
Decrementing NF causes the values of fields past the new value to be
lost, and the value of $0 to be recomputed, with the fields being
separated by the value of OFS.

Assigning a value to an existing field causes the whole record to be
rebuilt when $0 is referenced. Similarly, assigning a value to $0
causes the record to be resplit, creating new values for the fields.
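
The effect of assigning to a field past $NF can be sketched as
follows. The function name nf_demo is made up, and OFS is set to "-"
only to make the rebuilt record easier to read.

```shell
# Hypothetical demo: assigning to $(NF+2) grows NF and rebuilds $0.
nf_demo() {
    printf 'a b c\n' |
        awk 'BEGIN { OFS = "-" }
             {
                 $(NF + 2) = "e"      # creates $4 (empty) and $5
                 print NF             # new field count
                 print $0             # record rebuilt with OFS
                 print "[" $4 "]"     # the intervening field is empty
             }'
}
nf_demo
```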

Built-in Variables
Gawk's built-in variables are:

ARGC        The number of command line arguments (does not include
            options to gawk, or the program source).

ARGIND      The index in ARGV of the current file being processed.

ARGV        Array of command line arguments. The array is indexed from
            0 to ARGC - 1. Dynamically changing the contents of ARGV
            can control the files used for data.

BINMODE     On non-POSIX systems, specifies use of "binary" mode for
            all file I/O. Numeric values of 1, 2, or 3 specify that
            input files, output files, or all files, respectively,
            should use binary I/O. String values of "r" or "w" specify
            that input files or output files, respectively, should use
            binary I/O. String values of "rw" or "wr" specify that all
            files should use binary I/O. Any other string value is
            treated as "rw", but generates a warning message.

CONVFMT     The conversion format for numbers, "%.6g", by default.

ENVIRON     An array containing the values of the current environment.
            The array is indexed by the environment variables, each
            element being the value of that variable (e.g.,
            ENVIRON["HOME"] might be /home/arnold). Changing this
            array does not affect the environment seen by programs
            which gawk spawns via redirection or the system()
            function.

ERRNO       If a system error occurs either doing a redirection for
            getline, during a read for getline, or during a close(),
            then ERRNO will contain a string describing the error. The
            value is subject to translation in non-English locales.

FIELDWIDTHS A white-space separated list of field widths. When set,
            gawk parses the input into fields of fixed width, instead
            of using the value of the FS variable as the field
            separator.

FILENAME    The name of the current input file. If no files are
            specified on the command line, the value of FILENAME is
            "-". However, FILENAME is undefined inside the BEGIN block
            (unless set by getline).

FNR         The input record number in the current input file.

FS          The input field separator, a space by default. See Fields,
            above.

IGNORECASE  Controls the case-sensitivity of all regular expression
            and string operations. If IGNORECASE has a non-zero value,
            then string comparisons and pattern matching in rules,
            field splitting with FS, record separating with RS,
            regular expression matching with ~ and !~, and the
            gensub(), gsub(), index(), match(), split(), and sub()
            built-in functions all ignore case when doing regular
            expression operations. NOTE: Array subscripting is not
            affected, nor is the asort() function.

            Thus, if IGNORECASE is not equal to zero, /aB/ matches all
            of the strings "ab", "aB", "Ab", and "AB". As with all AWK
            variables, the initial value of IGNORECASE is zero, so all
            regular expression and string operations are normally
            case-sensitive. Under Unix, the full ISO 8859-1 Latin-1
            character set is used when ignoring case.

LINT        Provides dynamic control of the --lint option from within
            an AWK program. When true, gawk prints lint warnings. When
            false, it does not. When assigned the string value
            "fatal", lint warnings become fatal errors, exactly like
            --lint=fatal. Any other true value just prints warnings.

NF          The number of fields in the current input record.

NR          The total number of input records seen so far.

OFMT        The output format for numbers, "%.6g", by default.

OFS         The output field separator, a space by default.

ORS         The output record separator, by default a newline.

PROCINFO    The elements of this array provide access to information
            about the running AWK program. On some systems, there may
            be elements in the array, "group1" through "groupn" for
            some n, which is the number of supplementary groups that
            the process has. Use the in operator to test for these
            elements. The following elements are guaranteed to be
            available:

            PROCINFO["egid"]    the value of the getegid(2) system
                                call.
            PROCINFO["euid"]    the value of the geteuid(2) system
                                call.
            PROCINFO["FS"]      "FS" if field splitting with FS is in
                                effect, or "FIELDWIDTHS" if field
                                splitting with FIELDWIDTHS is in
                                effect.
            PROCINFO["gid"]     the value of the getgid(2) system
                                call.
            PROCINFO["pgrpid"]  the process group ID of the current
                                process.
            PROCINFO["pid"]     the process ID of the current process.
            PROCINFO["ppid"]    the parent process ID of the current
                                process.
            PROCINFO["uid"]     the value of the getuid(2) system
                                call.

RS          The input record separator, by default a newline.

RT          The record terminator. Gawk sets RT to the input text that
            matched the character or regular expression specified by
            RS.

RSTART      The index of the first character matched by match(); 0 if
            no match. (This implies that character indices start at
            one.)

RLENGTH     The length of the string matched by match(); -1 if no
            match.

SUBSEP      The character used to separate multiple subscripts in
            array elements, by default "\034".

TEXTDOMAIN  The text domain of the AWK program; used to find the
            localized translations for the program's strings.
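
A few of the variables above can be observed with a sketch along these
lines. It assumes two hypothetical input files, one.txt and two.txt,
created in a temporary directory; the function name vars_demo is also
made up. Only variables common to POSIX awk are used.

```shell
# Hypothetical demo of FILENAME, FNR, NR, RSTART and RLENGTH.
vars_demo() {
    dir=$(mktemp -d) || return 1
    printf 'x\ny\n' > "$dir/one.txt"
    printf 'z\n'    > "$dir/two.txt"
    # FNR restarts for each file, NR keeps counting across files.
    (cd "$dir" && awk '{ print FILENAME, FNR, NR }' one.txt two.txt)
    # match() sets RSTART and RLENGTH; 0 and -1 signal "no match".
    awk 'BEGIN {
        match("foobar", /o+/); print RSTART, RLENGTH
        match("foobar", /z/);  print RSTART, RLENGTH
    }'
    rm -rf "$dir"
}
vars_demo
```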

Arrays
Arrays are subscripted with an expression between square brackets ([
and ]). If the expression is an expression list (expr, expr ...) then
the array subscript is a string consisting of the concatenation of the
(string) value of each expression, separated by the value of the
SUBSEP variable. This facility is used to simulate multiply
dimensioned arrays. For example:

    i = "A"; j = "B"; k = "C"
    x[i, j, k] = "hello, world\n"

assigns the string "hello, world\n" to the element of the array x
which is indexed by the string "A\034B\034C". All arrays in AWK are
associative, i.e. indexed by string values.

The special operator in may be used in an if or while statement to see
if an array has an index consisting of a particular value.

    if (val in array)
        print array[val]

If the array has multiple subscripts, use (i, j) in array.

The in construct may also be used in a for loop to iterate over all
the elements of an array.

An element may be deleted from an array using the delete statement.
The delete statement may also be used to delete the entire contents of
an array, just by specifying the array name without a subscript.
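
The subscript handling described above can be sketched like this. The
array name x follows the text; the function name array_demo and the
helper array part are invented for the example.

```shell
# Hypothetical demo: multi-part subscripts, "in", for-in and delete.
array_demo() {
    awk 'BEGIN {
        x["A", "B"] = "hello"               # index is "A" SUBSEP "B"
        if (("A", "B") in x) print "found"
        for (k in x) {
            split(k, part, SUBSEP)          # recover the two subscripts
            print part[1], part[2]
        }
        delete x["A", "B"]
        if (!(("A", "B") in x)) print "deleted"
    }'
}
array_demo
```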

Variable Typing And Conversion
Variables and fields may be (floating point) numbers, or strings, or
both. How the value of a variable is interpreted depends upon its
context. If used in a numeric expression, it will be treated as a
number, if used as a string it will be treated as a string.

To force a variable to be treated as a number, add 0 to it; to force
it to be treated as a string, concatenate it with the null string.

When a string must be converted to a number, the conversion is
accomplished using strtod(3). A number is converted to a string by
using the value of CONVFMT as a format string for sprintf(3), with the
numeric value of the variable as the argument. However, even though
all numbers in AWK are floating-point, integral values are always
converted as integers. Thus, given

    CONVFMT = "%2.2f"
    a = 12
    b = a ""

the variable b has a string value of "12" and not "12.00".
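
The example above can be run as a short sketch, with a non-integral
value added for contrast. The variables c and d and the function name
convfmt_demo are additions made for this illustration.

```shell
# Hypothetical demo: CONVFMT applies to 12.5 but not to the integer 12.
convfmt_demo() {
    awk 'BEGIN {
        CONVFMT = "%2.2f"
        a = 12;   b = a ""    # integral value: converted as an integer
        c = 12.5; d = c ""    # non-integral value: formatted by CONVFMT
        print b
        print d
    }'
}
convfmt_demo
```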

Gawk performs comparisons as follows: If two variables are numeric,
they are compared numerically. If one value is numeric and the other
has a string value that is a "numeric string," then comparisons are
also done numerically. Otherwise, the numeric value is converted to a
string and a string comparison is performed. Two strings are compared,
of course, as strings. Note that the POSIX standard applies the
concept of "numeric string" everywhere, even to string constants.
However, this is clearly incorrect, and gawk does not do this.
(Fortunately, this is fixed in the next version of the standard.)

Note that string constants, such as "57", are not numeric strings,
they are string constants. The idea of "numeric string" only applies
to fields, getline input, FILENAME, ARGV elements, ENVIRON elements
and the elements of an array created by split() that are numeric
strings. The basic idea is that user input, and only user input, that
looks numeric, should be treated that way.
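
The difference between numeric strings from input and string constants
can be sketched as follows; the function name compare_demo and the
sample values are made up. Each comparison prints 1 for true and 0 for
false.

```shell
# Hypothetical demo: fields compare numerically, constants as strings.
compare_demo() {
    printf '10 9\n' |
        awk '{
            print "fields:", ($1 > $2)          # numeric: 10 > 9
            print "constants:", ("10" > "9")    # string: "1" sorts before "9"
        }'
}
compare_demo
```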

Uninitialized variables have the numeric value 0 and the string value
"" (the null, or empty, string).

Octal and Hexadecimal Constants
Starting with version 3.1 of gawk, you may use C-style octal and
hexadecimal constants in your AWK program source code. For example,
the octal value 011 is equal to decimal 9, and the hexadecimal value
0x11 is equal to decimal 17.

String Constants
String constants in AWK are sequences of characters enclosed between
double quotes ("). Within strings, certain escape sequences are
recognized, as in C. These are:

\\    A literal backslash.
\a    The "alert" character; usually the ASCII BEL character.
\b    backspace.
\f    form-feed.
\n    newline.
\r    carriage return.
\t    horizontal tab.
\v    vertical tab.
\xhex digits
      The character represented by the string of hexadecimal digits
      following the \x. As in ANSI C, all following hexadecimal digits
      are considered part of the escape sequence. (This feature should
      tell us something about language design by committee.) E.g.,
      "\x1B" is the ASCII ESC (escape) character.
\ddd  The character represented by the 1-, 2-, or 3-digit sequence of
      octal digits. E.g., "\033" is the ASCII ESC (escape) character.
\c    The literal character c.

The escape sequences may also be used inside constant regular
expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace characters).

In compatibility mode, the characters represented by octal and
hexadecimal escape sequences are treated literally when used in
regular expression constants. Thus, /a\52b/ is equivalent to /a\*b/.
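
A brief sketch of a few of these escape sequences in string constants;
the function name escape_demo is made up, and the octal codes 101 and
102 were chosen because they are the ASCII letters A and B.

```shell
# Hypothetical demo: tab, octal and backslash escapes in strings.
escape_demo() {
    awk 'BEGIN {
        print "a\tb"           # horizontal tab between a and b
        print "\101\102"       # octal escapes for "A" and "B"
        print "back\\slash"    # a literal backslash
    }'
}
escape_demo
```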

PATTERNS AND ACTIONS
AWK is a line-oriented language. The pattern comes first, and then the
action. Action statements are enclosed in { and }. Either the pattern
may be missing, or the action may be missing, but, of course, not
both. If the pattern is missing, the action is executed for every
single record of input. A missing action is equivalent to

    { print }

which prints the entire record.
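
Both defaults can be seen in a sketch such as the following; the
function name default_demo and the sample input are invented.

```shell
# Hypothetical demo: a missing action prints; a missing pattern
# matches every record.
default_demo() {
    # Pattern with no action: matching records are printed.
    printf 'x\ny\n' | awk '/y/'
    # Action with no pattern: runs for every record.
    printf 'x\ny\n' | awk '{ print "saw " $0 }'
}
default_demo
```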

Comments begin with the "#" character, and continue until the end of
the line. Blank lines may be used to separate statements. Normally, a
statement ends with a newline; however, this is not the case for lines
ending in a ",", {, ?, :, &&, or ||. Lines ending in do or else also
have their statements automatically continued on the following line.
In other cases, a line can be continued by ending it with a "\", in
which case the newline will be ignored.

Multiple statements may be put on one line by separating them with a
";". This applies to both the statements within the action part of a
pattern-action pair (the usual case), and to the pattern-action
statements themselves.

Patterns
AWK patterns may be one of the following:

    BEGIN
    END
    /regular expression/
    relational expression
    pattern && pattern
    pattern || pattern
    pattern ? pattern : pattern
    (pattern)
    ! pattern
    pattern1, pattern2

BEGIN and END are two special kinds of patterns which are not tested
against the input. The action parts of all BEGIN patterns are merged
as if all the statements had been written in a single BEGIN block.
They are executed before any of the input is read. Similarly, all the
END blocks are merged, and executed when all the input is exhausted
(or when an exit statement is executed). BEGIN and END patterns cannot
be combined with other patterns in pattern expressions. BEGIN and END
patterns cannot have missing action parts.

For /regular expression/ patterns, the associated statement is
executed for each input record that matches the regular expression.
Regular expressions are the same as those in egrep(1), and are
summarized below.

A relational expression may use any of the operators defined below in
the section on actions. These generally test whether certain fields
match certain regular expressions.

The &&, ||, and ! operators are logical AND, logical OR, and logical
NOT, respectively, as in C. They do short-circuit evaluation, also as
in C, and are used for combining more primitive pattern expressions.
As in most languages, parentheses may be used to change the order of
evaluation.

The ?: operator is like the same operator in C. If the first pattern
is true then the pattern used for testing is the second pattern,
otherwise it is the third. Only one of the second and third patterns
is evaluated.

The pattern1, pattern2 form of an expression is called a range
pattern. It matches all input records starting with a record that
matches pattern1, and continuing until a record that matches pattern2,
inclusive. It does not combine with any other sort of pattern
expression.
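
A range pattern can be sketched with a five-line input in which the
words "start" and "end" mark the range; the function name range_demo
and the data are made up. No action is given, so matching records are
simply printed.

```shell
# Hypothetical demo: a range pattern selects from "start" through "end".
range_demo() {
    printf '1\nstart\n2\nend\n3\n' | awk '/start/,/end/'
}
range_demo
```

Both boundary records are included, since the range is inclusive.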

Regular Expressions
Regular expressions are the extended kind found in egrep. They are
composed of characters as follows:

c           matches the non-metacharacter c.
\c          matches the literal character c.
.           matches any character including newline.
^           matches the beginning of a string.
$           matches the end of a string.
[abc...]    character list, matches any of the characters abc....
[^abc...]   negated character list, matches any character except
            abc....
r1|r2       alternation: matches either r1 or r2.
r1r2        concatenation: matches r1, and then r2.
r+          matches one or more r's.
r*          matches zero or more r's.
r?          matches zero or one r's.
(r)         grouping: matches r.
r{n}
r{n,}
r{n,m}      One or two numbers inside braces denote an interval
            expression. If there is one number in the braces, the
            preceding regular expression r is repeated n times. If
            there are two numbers separated by a comma, r is repeated
            n to m times. If there is one number followed by a comma,
            then r is repeated at least n times. Interval expressions
            are only available if either --posix or --re-interval is
            specified on the command line.
\y          matches the empty string at either the beginning or the
            end of a word.
\B          matches the empty string within a word.
\<          matches the empty string at the beginning of a word.
\>          matches the empty string at the end of a word.
\w          matches any word-constituent character (letter, digit, or
            underscore).
\W          matches any character that is not word-constituent.
\`          matches the empty string at the beginning of a buffer
            (string).
\'          matches the empty string at the end of a buffer.

The escape sequences that are valid in string constants (see String
Constants, above) are also valid in regular expressions.
_C_h_a_r_a_c_t_e_r _c_l_a_s_s_e_s are a new feature introduced in the
POSIX standard. A character class is a special notation
for describing lists of characters that have a specific
attribute, but where the actual characters themselves can
vary from country to country and/or from character set to
character set. For example, the notion of what is an
alphabetic character differs in the USA and in France.
A character class is only valid in a regular expression
_i_n_s_i_d_e the brackets of a character list. Character
classes consist of [[::, a keyword denoting the class, and
::]]. The character classes defined by the POSIX standard
are:
[[::aallnnuumm::]] Alphanumeric characters.
[[::aallpphhaa::]] Alphabetic characters.
[[::bbllaannkk::]] Space or tab characters.
[[::ccnnttrrll::]] Control characters.
[[::ddiiggiitt::]] Numeric characters.
[[::ggrraapphh::]] Characters that are both printable and visible.
(A space is printable, but not visible, while
an aa is both.)
[[::lloowweerr::]] Lower-case alphabetic characters.
[[::pprriinntt::]] Printable characters (characters that are not
control characters.)
[[::ppuunncctt::]] Punctuation characters (characters that are not
letter, digits, control characters, or space
characters).
[[::ssppaaccee::]] Space characters (such as space, tab, and form
feed, to name a few).
[[::uuppppeerr::]] Upper-case alphabetic characters.
[[::xxddiiggiitt::]] Characters that are hexadecimal digits.
For example, before the POSIX standard, to match alphanumeric
characters, you would have had to write /[A-Za-z0-9]/. If your
character set had other alphabetic characters in it, this would
not match them, and if your character set collated differently
from ASCII, this might not even match the ASCII alphanumeric
characters. With the POSIX character classes, you can write
/[[:alnum:]]/, and this matches the alphabetic and numeric
characters in your character set.
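As a minimal sketch of the class notation in use (the three
sample input lines are invented for illustration, and a generic
awk command is shown because only POSIX features are involved),
the following prints only the records made up entirely of
alphanumeric characters:

```shell
# Sample input is hypothetical; only the first line is purely alphanumeric.
printf 'abc123\nfoo bar\n---\n' |
    awk '/^[[:alnum:]]+$/ { print "alnum:", $0 }'
# prints: alnum: abc123
```

Note that the class name sits inside a second pair of brackets,
as the text above requires.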
Two additional special sequences can appear in character lists.
These apply to non-ASCII character sets, which can have single
symbols (called collating elements) that are represented with
more than one character, as well as several characters that are
equivalent for collating, or sorting, purposes. (E.g., in
French, a plain "e" and a grave-accented "è" are equivalent.)

Collating Symbols
     A collating symbol is a multi-character collating element
     enclosed in [. and .]. For example, if ch is a collating
     element, then [[.ch.]] is a regular expression that
     matches this collating element, while [ch] is a regular
     expression that matches either c or h.

Equivalence Classes
     An equivalence class is a locale-specific name for a list
     of characters that are equivalent. The name is enclosed
     in [= and =]. For example, the name e might be used to
     represent all of "e", "é", and "è". In this case, [[=e=]]
     is a regular expression that matches any of e, é, or è.
These features are very valuable in non-English speaking
locales. The library functions that gawk uses for regular
expression matching currently only recognize POSIX character
classes; they do not recognize collating symbols or equivalence
classes.

The \y, \B, \<, \>, \w, \W, \`, and \' operators are specific
to gawk; they are extensions based on facilities in the GNU
regular expression libraries.
The various command line options control how gawk interprets
characters in regular expressions.

No options
     In the default case, gawk provides all the facilities of
     POSIX regular expressions and the GNU regular expression
     operators described above. However, interval expressions
     are not supported.

--posix
     Only POSIX regular expressions are supported; the GNU
     operators are not special (e.g., \w matches a literal w).
     Interval expressions are allowed.

--traditional
     Traditional Unix awk regular expressions are matched. The
     GNU operators are not special, interval expressions are
     not available, and neither are the POSIX character
     classes ([[:alnum:]] and so on). Characters described by
     octal and hexadecimal escape sequences are treated
     literally, even if they represent regular expression
     metacharacters.

--re-interval
     Allow interval expressions in regular expressions, even
     if --traditional has been provided.
Actions
Action statements are enclosed in braces, { and }. Action
statements consist of the usual assignment, conditional, and
looping statements found in most languages. The operators,
control statements, and input/output statements available are
patterned after those in C.
Operators
The operators in AWK, in order of decreasing precedence, are

(...)       Grouping.
$           Field reference.
++ --       Increment and decrement, both prefix and postfix.
^           Exponentiation (** may also be used, and **= for
            the assignment operator).
+ - !       Unary plus, unary minus, and logical negation.
* / %       Multiplication, division, and modulus.
+ -         Addition and subtraction.
space       String concatenation.
< > <= >= != ==
            The regular relational operators.
~ !~        Regular expression match, negated match. NOTE: Do
            not use a constant regular expression (/foo/) on
            the left-hand side of a ~ or !~. Only use one on
            the right-hand side. The expression /foo/ ~ exp
            has the same meaning as (($0 ~ /foo/) ~ exp). This
            is usually not what was intended.
in          Array membership.
&&          Logical AND.
||          Logical OR.
?:          The C conditional expression. This has the form
            expr1 ? expr2 : expr3. If expr1 is true, the value
            of the expression is expr2, otherwise it is expr3.
            Only one of expr2 and expr3 is evaluated.
= += -= *= /= %= ^=
            Assignment. Both absolute assignment (var = value)
            and operator-assignment (the other forms) are
            supported.
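A few consequences of this precedence order, as a small sketch
(the values are arbitrary, and a generic awk command is used
since nothing here is specific to gawk):

```shell
awk 'BEGIN {
    print 2 ^ 3 ^ 2        # exponentiation groups right to left, so 2^9
    print -2 ^ 2           # ^ binds tighter than unary minus
    print 1 " " 2 + 3      # addition is done before concatenation
    x = 5
    # Parentheses keep the > from being read as an output redirection.
    print (x > 3 ? "big" : "small")
}'
# prints, one per line: 512, -4, 1 5, big
```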
Control Statements
The control statements are as follows:

     if (condition) statement [ else statement ]
     while (condition) statement
     do statement while (condition)
     for (expr1; expr2; expr3) statement
     for (var in array) statement
     break
     continue
     delete array[index]
     delete array
     exit [ expression ]
     { statements }
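The following sketch exercises several of these statements
together. The array name sq and the loop bounds are chosen only
for illustration; a generic awk command is used because the
statements shown are not gawk-specific.

```shell
awk 'BEGIN {
    for (i = 1; i <= 3; i++)      # C-style for loop
        sq[i] = i * i
    delete sq[2]                  # remove a single element
    n = 0
    for (k in sq)                 # visit the remaining indices
        n++
    print n, (2 in sq), sq[3]     # prints: 2 0 9

    i = 0
    do {
        i++
        if (i == 2)
            continue              # jump to the loop condition
        if (i > 3)
            break                 # leave the loop
    } while (1)
    print i                       # prints: 4
}'
```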
I/O Statements
The input/output statements are as follows:

close(file [, how])   Close file, pipe or co-process. The
                      optional how should only be used when
                      closing one end of a two-way pipe to a
                      co-process. It must be a string value,
                      either "to" or "from".
getline               Set $0 from next input record; set NF,
                      NR, FNR.
getline <file         Set $0 from next record of file; set NF.
getline var           Set var from next input record; set NR,
                      FNR.
getline var <file     Set var from next record of file.
command | getline [var]
                      Run command piping the output either
                      into $0 or var, as above.
command |& getline [var]
                      Run command as a co-process piping the
                      output either into $0 or var, as above.
                      Co-processes are a gawk extension.
next                  Stop processing the current input
                      record. The next input record is read
                      and processing starts over with the
                      first pattern in the AWK program. If the
                      end of the input data is reached, the
                      END block(s), if any, are executed.
nextfile              Stop processing the current input file.
                      The next input record read comes from
                      the next input file. FILENAME and ARGIND
                      are updated, FNR is reset to 1, and
                      processing starts over with the first
                      pattern in the AWK program. If the end
                      of the input data is reached, the END
                      block(s), if any, are executed.
print                 Prints the current record. The output
                      record is terminated with the value of
                      the ORS variable.
print expr-list       Prints expressions. Each expression is
                      separated by the value of the OFS
                      variable. The output record is
                      terminated with the value of the ORS
                      variable.
print expr-list >file Prints expressions on file. Each
                      expression is separated by the value of
                      the OFS variable. The output record is
                      terminated with the value of the ORS
                      variable.
printf fmt, expr-list Format and print.
printf fmt, expr-list >file
                      Format and print on file.
system(cmd-line)      Execute the command cmd-line, and return
                      the exit status. (This may not be
                      available on non-POSIX systems.)
fflush([file])        Flush any buffers associated with the
                      open output file or pipe file. If file
                      is missing, then standard output is
                      flushed. If file is the null string,
                      then all open output files and pipes
                      have their buffers flushed.
Additional output redirections are allowed for print and
printf.

print ... >> file     appends output to the file.
print ... | command   writes on a pipe.
print ... |& command  sends data to a co-process.

The getline command returns 1 on success, 0 on end of file, and
-1 on an error. Upon an error, ERRNO contains a string
describing the problem.

NOTE: If using a pipe or co-process to getline, or from print
or printf within a loop, you must use close() to create new
instances of the command. AWK does not automatically close
pipes or co-processes when they return EOF.
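A minimal sketch of the read loop and of why close() matters.
The command string "echo hello" and the variable names are
placeholders; a generic awk command is used since only standard
features appear.

```shell
awk 'BEGIN {
    cmd = "echo hello"
    n = 0
    # Loop while getline reports success; stop at end of file or on error.
    while ((cmd | getline line) > 0)
        n++
    close(cmd)             # without this, the next read only sees EOF again
    cmd | getline again    # starts a fresh instance of the command
    print n, again         # prints: 1 hello
}'
```

Comparing the result with "> 0" rather than testing it for truth
avoids looping forever when getline returns -1.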
The printf Statement
The AWK versions of the printf statement and sprintf() function
(see below) accept the following conversion specification
formats:

%c      An ASCII character. If the argument used for %c is
        numeric, it is treated as a character and printed.
        Otherwise, the argument is assumed to be a string, and
        only the first character of that string is printed.
%d, %i  A decimal number (the integer part).
%e, %E  A floating point number of the form
        [-]d.dddddde[+-]dd. The %E format uses E instead of e.
%f      A floating point number of the form [-]ddd.dddddd.
%g, %G  Use %e or %f conversion, whichever is shorter, with
        nonsignificant zeros suppressed. The %G format uses %E
        instead of %e.
%o      An unsigned octal number (also an integer).
%u      An unsigned decimal number (again, an integer).
%s      A character string.
%x, %X  An unsigned hexadecimal number (an integer). The %X
        format uses ABCDEF instead of abcdef.
%%      A single % character; no argument is converted.
Optional, additional parameters may lie between the % and the
control letter:

count$  Use the count'th argument at this point in the
        formatting. This is called a positional specifier and
        is intended primarily for use in translated versions
        of format strings, not in the original text of an AWK
        program. It is a gawk extension.
-       The expression should be left-justified within its
        field.
space   For numeric conversions, prefix positive values with a
        space, and negative values with a minus sign.
+       The plus sign, used before the width modifier (see
        below), says to always supply a sign for numeric
        conversions, even if the data to be formatted is
        positive. The + overrides the space modifier.
#       Use an "alternate form" for certain control letters.
        For %o, supply a leading zero. For %x and %X, supply a
        leading 0x or 0X for a nonzero result. For %e, %E, and
        %f, the result always contains a decimal point. For %g
        and %G, trailing zeros are not removed from the
        result.
0       A leading 0 (zero) acts as a flag that indicates
        output should be padded with zeroes instead of spaces.
        This applies even to non-numeric output formats. This
        flag only has an effect when the field width is wider
        than the value to be printed.
width   The field should be padded to this width. The field is
        normally padded with spaces. If the 0 flag has been
        used, it is padded with zeroes.
.prec   A number that specifies the precision to use when
        printing. For the %e, %E, and %f formats, this
        specifies the number of digits you want printed to the
        right of the decimal point. For the %g and %G formats,
        it specifies the maximum number of significant digits.
        For the %d, %o, %i, %u, %x, and %X formats, it
        specifies the minimum number of digits to print. For
        %s, it specifies the maximum number of characters from
        the string that should be printed.

The dynamic width and prec capabilities of the ANSI C printf()
routines are supported. A * in place of either the width or
prec specifications causes their values to be taken from the
argument list to printf or sprintf(). To use a positional
specifier with a dynamic width or precision, supply the count$
after the * in the format string. For example, "%3$*2$.*1$s".
Special File Names
When doing I/O redirection from either print or printf into a
file, or via getline from a file, gawk recognizes certain
special filenames internally. These filenames allow access to
open file descriptors inherited from gawk's parent process
(usually the shell). These file names may also be used on the
command line to name data files. The filenames are:

/dev/stdin    The standard input.
/dev/stdout   The standard output.
/dev/stderr   The standard error output.
/dev/fd/n     The file associated with the open file
              descriptor n.

These are particularly useful for error messages. For example:

     print "You blew it!" > "/dev/stderr"

whereas you would otherwise have to use

     print "You blew it!" | "cat 1>&2"
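To see that the two streams really are separate, the output can
be split at the shell level. This is only a sketch with made-up
message text; a generic awk command is shown, on the assumption
that /dev/stderr is usable on the system at hand.

```shell
awk 'BEGIN {
    print "normal output"
    print "warning: something went wrong" > "/dev/stderr"
}' 2>/dev/null
# With standard error discarded, only "normal output" is printed.
```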
The following special filenames may be used with the |&
co-process operator for creating TCP/IP network connections.

/inet/tcp/lport/rhost/rport
              File for TCP/IP connection on local port lport
              to remote host rhost on remote port rport. Use a
              port of 0 to have the system pick a port.
/inet/udp/lport/rhost/rport
              Similar, but use UDP/IP instead of TCP/IP.
/inet/raw/lport/rhost/rport
              Reserved for future use.
Other special filenames provide access to information about the
running gawk process. These filenames are now obsolete. Use the
PROCINFO array to obtain the information they provide. The
filenames are:

/dev/pid      Reading this file returns the process ID of the
              current process, in decimal, terminated with a
              newline.
/dev/ppid     Reading this file returns the parent process ID
              of the current process, in decimal, terminated
              with a newline.
/dev/pgrpid   Reading this file returns the process group ID
              of the current process, in decimal, terminated
              with a newline.
/dev/user     Reading this file returns a single record
              terminated with a newline. The fields are
              separated with spaces. $1 is the value of the
              getuid(2) system call, $2 is the value of the
              geteuid(2) system call, $3 is the value of the
              getgid(2) system call, and $4 is the value of
              the getegid(2) system call. If there are any
              additional fields, they are the group IDs
              returned by getgroups(2). Multiple groups may
              not be supported on all systems.
Numeric Functions
AWK has the following built-in arithmetic functions:

atan2(y, x)     Returns the arctangent of y/x in radians.
cos(expr)       Returns the cosine of expr, which is in
                radians.
exp(expr)       The exponential function.
int(expr)       Truncates to integer.
log(expr)       The natural logarithm function.
rand()          Returns a random number N, such that
                0 <= N < 1.
sin(expr)       Returns the sine of expr, which is in radians.
sqrt(expr)      The square root function.
srand([expr])   Uses expr as a new seed for the random number
                generator. If no expr is provided, the time of
                day is used. The return value is the previous
                seed for the random number generator.
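A short sketch of these functions at work. The inputs are
arbitrary and a generic awk command is used; the seed value 42 is
just an example.

```shell
awk 'BEGIN {
    print int(3.9), int(-3.9)         # truncation toward zero: 3 -3
    print sqrt(16), exp(0), log(1)    # prints: 4 1 0
    print atan2(0, -1)                # pi in the default output format: 3.14159
    srand(42); a = rand()
    srand(42); b = rand()             # same seed, so the same first value
    print (a == b), (a >= 0 && a < 1) # prints: 1 1
}'
```

Reseeding with a fixed value is a common way to make a run that
uses rand() repeatable.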
String Functions
Gawk has the following built-in string functions:

asort(s [, d])
     Returns the number of elements in the source array s. The
     contents of s are sorted using gawk's normal rules for
     comparing values, and the indexes of the sorted values of
     s are replaced with sequential integers starting with 1.
     If the optional destination array d is specified, then s
     is first duplicated into d, and then d is sorted, leaving
     the indexes of the source array s unchanged.

gensub(r, s, h [, t])
     Search the target string t for matches of the regular
     expression r. If h is a string beginning with g or G,
     then replace all matches of r with s. Otherwise, h is a
     number indicating which match of r to replace. If t is
     not supplied, $0 is used instead. Within the replacement
     text s, the sequence \n, where n is a digit from 1 to 9,
     may be used to indicate just the text that matched the
     n'th parenthesized subexpression. The sequence \0
     represents the entire matched text, as does the character
     &. Unlike sub() and gsub(), the modified string is
     returned as the result of the function, and the original
     target string is not changed.

gsub(r, s [, t])
     For each substring matching the regular expression r in
     the string t, substitute the string s, and return the
     number of substitutions. If t is not supplied, use $0. An
     & in the replacement text is replaced with the text that
     was actually matched. Use \& to get a literal &. (This
     must be typed as "\\&"; see GAWK: Effective AWK
     Programming for a fuller discussion of the rules for &'s
     and backslashes in the replacement text of sub(), gsub(),
     and gensub().)
index(s, t)
     Returns the index of the string t in the string s, or 0
     if t is not present. (This implies that character indices
     start at one.)

length([s])
     Returns the length of the string s, or the length of $0
     if s is not supplied.

match(s, r [, a])
     Returns the position in s where the regular expression r
     occurs, or 0 if r is not present, and sets the values of
     RSTART and RLENGTH. Note that the argument order is the
     same as for the ~ operator: str ~ re. If array a is
     provided, a is cleared and then elements 1 through n are
     filled with the portions of s that match the
     corresponding parenthesized subexpression in r. The 0'th
     element of a contains the portion of s matched by the
     entire regular expression r.

split(s, a [, r])
     Splits the string s into the array a on the regular
     expression r, and returns the number of fields. If r is
     omitted, FS is used instead. The array a is cleared
     first. Splitting behaves identically to field splitting,
     described above.

sprintf(fmt, expr-list)
     Prints expr-list according to fmt, and returns the
     resulting string.

strtonum(str)
     Examines str, and returns its numeric value. If str
     begins with a leading 0, strtonum() assumes that str is
     an octal number. If str begins with a leading 0x or 0X,
     strtonum() assumes that str is a hexadecimal number.

sub(r, s [, t])
     Just like gsub(), but only the first matching substring
     is replaced.

substr(s, i [, n])
     Returns the substring of s that starts at i and is at
     most n characters long. If n is omitted, the rest of s is
     used.

tolower(str)
     Returns a copy of the string str, with all the upper-case
     characters in str translated to their corresponding
     lower-case counterparts. Non-alphabetic characters are
     left unchanged.

toupper(str)
     Returns a copy of the string str, with all the lower-case
     characters in str translated to their corresponding
     upper-case counterparts. Non-alphabetic characters are
     left unchanged.
Time Functions
Since one of the primary uses of AWK programs is processing log
files that contain time stamp information, gawk provides the
following functions for obtaining time stamps and formatting
them.

mktime(datespec)
     Turns datespec into a time stamp of the same form as
     returned by systime(). The datespec is a string of the
     form YYYY MM DD HH MM SS[ DST]. The contents of the
     string are six or seven numbers representing respectively
     the full year including century, the month from 1 to 12,
     the day of the month from 1 to 31, the hour of the day
     from 0 to 23, the minute from 0 to 59, and the second
     from 0 to 60, and an optional daylight saving flag. The
     values of these numbers need not be within the ranges
     specified; for example, an hour of -1 means 1 hour before
     midnight. The origin-zero Gregorian calendar is assumed,
     with year 0 preceding year 1 and year -1 preceding year
     0. The time is assumed to be in the local timezone. If
     the daylight saving flag is positive, the time is assumed
     to be daylight saving time; if zero, the time is assumed
     to be standard time; and if negative (the default),
     mktime() attempts to determine whether daylight saving
     time is in effect for the specified time. If datespec
     does not contain enough elements or if the resulting time
     is out of range, mktime() returns -1.

strftime([format [, timestamp]])
     Formats timestamp according to the specification in
     format. The timestamp should be of the same form as
     returned by systime(). If timestamp is missing, the
     current time of day is used. If format is missing, a
     default format equivalent to the output of date(1) is
     used. See the specification for the strftime() function
     in ANSI C for the format conversions that are guaranteed
     to be available. A public-domain version of strftime(3)
     and a man page for it come with gawk; if that version was
     used to build gawk, then all of the conversions described
     in that man page are available to gawk.

systime()
     Returns the current time of day as the number of seconds
     since the Epoch (1970-01-01 00:00:00 UTC on POSIX
     systems).
Bit Manipulation Functions
Starting with version 3.1 of gawk, the following bit
manipulation functions are available. They work by converting
double-precision floating point values to unsigned long
integers, doing the operation, and then converting the result
back to floating point. The functions are:

and(v1, v2)         Return the bitwise AND of the values
                    provided by v1 and v2.
compl(val)          Return the bitwise complement of val.
lshift(val, count)  Return the value of val, shifted left by
                    count bits.
or(v1, v2)          Return the bitwise OR of the values
                    provided by v1 and v2.
rshift(val, count)  Return the value of val, shifted right by
                    count bits.
xor(v1, v2)         Return the bitwise XOR of the values
                    provided by v1 and v2.
Internationalization Functions
Starting with version 3.1 of gawk, the following functions may
be used from within your AWK program for translating strings at
run-time. For full details, see GAWK: Effective AWK
Programming.

bindtextdomain(directory [, domain])
     Specifies the directory where gawk looks for the .mo
     files, in case they will not or cannot be placed in the
     "standard" locations (e.g., during testing). It returns
     the directory where domain is "bound."
     The default domain is the value of TEXTDOMAIN. If
     directory is the null string (""), then bindtextdomain()
     returns the current binding for the given domain.

dcgettext(string [, domain [, category]])
     Returns the translation of string in text domain domain
     for locale category category. The default value for
     domain is the current value of TEXTDOMAIN. The default
     value for category is "LC_MESSAGES".
     If you supply a value for category, it must be a string
     equal to one of the known locale categories described in
     GAWK: Effective AWK Programming. You must also supply a
     text domain. Use TEXTDOMAIN if you want to use the
     current domain.

dcngettext(string1, string2, number [, domain [, category]])
     Returns the plural form used for number of the
     translation of string1 and string2 in text domain domain
     for locale category category. The default value for
     domain is the current value of TEXTDOMAIN. The default
     value for category is "LC_MESSAGES".
     If you supply a value for category, it must be a string
     equal to one of the known locale categories described in
     GAWK: Effective AWK Programming. You must also supply a
     text domain. Use TEXTDOMAIN if you want to use the
     current domain.
USER-DEFINED FUNCTIONS
Functions in AWK are defined as follows:

     function name(parameter list) { statements }

Functions are executed when they are called from within
expressions in either patterns or actions. Actual parameters
supplied in the function call are used to instantiate the
formal parameters declared in the function. Arrays are passed
by reference, other variables are passed by value.

Since functions were not originally part of the AWK language,
the provision for local variables is rather clumsy: They are
declared as extra parameters in the parameter list. The
convention is to separate local variables from real parameters
by extra spaces in the parameter list. For example:

     function f(p, q,     a, b)   # a and b are local
     {
          ...
     }

     /abc/     { ... ; f(1, 2) ; ... }

The left parenthesis in a function call is required to
immediately follow the function name, without any intervening
white space. This is to avoid a syntactic ambiguity with the
concatenation operator. This restriction does not apply to the
built-in functions listed above.

Functions may call each other and may be recursive. Function
parameters used as local variables are initialized to the null
string and the number zero upon function invocation.

Use return expr to return a value from a function. The return
value is undefined if no value is provided, or if the function
returns by "falling off" the end.

If --lint has been provided, gawk warns about calls to
undefined functions at parse time, instead of at run time.
Calling an undefined function at run time is a fatal error.

The word func may be used in place of function.
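The rules above can be seen in a small program. The function
names fact, fill, and bump are invented for this sketch, and a
generic awk command is used because nothing in it is specific to
gawk.

```shell
awk '
function fact(n,    r) {        # r is a local variable, not a real parameter
    if (n <= 1)
        return 1
    r = n * fact(n - 1)         # each recursive call gets its own r
    return r
}
function fill(arr) {            # arrays are passed by reference
    arr["x"] = 1
}
function bump(v) {              # other variables are passed by value
    v++
}
BEGIN {
    print fact(5)               # prints: 120
    fill(a); print a["x"]       # prints: 1
    k = 7; bump(k); print k     # prints: 7
}'
```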
DYNAMICALLY LOADING NEW FUNCTIONS
Beginning with version 3.1 of gawk, you can dynamically add new
built-in functions to the running gawk interpreter. The full
details are beyond the scope of this manual page; see GAWK:
Effective AWK Programming for the details.

extension(object, function)
     Dynamically link the shared object file named by object,
     and invoke function in that object, to perform
     initialization. These should both be provided as strings.
     Returns the value returned by function.

This function is provided and documented in GAWK: Effective AWK
Programming, but everything about this feature is likely to
change in the next release. We STRONGLY recommend that you do
not use this feature for anything that you aren't willing to
redo.
SIGNALS
pgawk accepts two signals. SIGUSR1 causes it to dump a profile
and function call stack to the profile file, which is either
awkprof.out, or whatever file was named with the --profile
option. It then continues to run. SIGHUP causes it to dump the
profile and function call stack and then exit.
EXAMPLES
Print and sort the login names of all users:

     BEGIN     { FS = ":" }
               { print $1 | "sort" }

Count lines in a file:

               { nlines++ }
     END       { print nlines }

Precede each line by its number in the file:

     { print FNR, $0 }

Concatenate and line number (a variation on a theme):

     { print NR, $0 }
INTERNATIONALIZATION
String constants are sequences of characters enclosed in double
quotes. In non-English speaking environments, it is possible to
mark strings in the AWK program as requiring translation to the
native natural language. Such strings are marked in the AWK
program with a leading underscore ("_"). For example,

     gawk 'BEGIN { print "hello, world" }'

always prints hello, world. But,

     gawk 'BEGIN { print _"hello, world" }'

might print bonjour, monde in France.

There are several steps involved in producing and running a
localizable AWK program.

1.  Add a BEGIN action to assign a value to the TEXTDOMAIN
    variable to set the text domain to a name associated with
    your program.

         BEGIN { TEXTDOMAIN = "myprog" }

    This allows gawk to find the .mo file associated with your
    program. Without this step, gawk uses the messages text
    domain, which likely does not contain translations for
    your program.

2.  Mark all strings that should be translated with leading
    underscores.

3.  If necessary, use the dcgettext() and/or bindtextdomain()
    functions in your program, as appropriate.

4.  Run gawk --gen-po -f myprog.awk > myprog.po to generate a
    .po file for your program.

5.  Provide appropriate translations, and build and install a
    corresponding .mo file.

The internationalization features are described in full detail
in GAWK: Effective AWK Programming.
POSIX COMPATIBILITY
A primary goal for gawk is compatibility with the POSIX
standard, as well as with the latest version of UNIX awk. To
this end, gawk incorporates the following user visible features
which are not described in the AWK book, but are part of the
Bell Laboratories version of awk, and are in the POSIX
standard.

The book indicates that command line variable assignment
happens when awk would otherwise open the argument as a file,
which is after the BEGIN block is executed. However, in earlier
implementations, when such an assignment appeared before any
file names, the assignment would happen before the BEGIN block
was run. Applications came to depend on this "feature." When
awk was changed to match its documentation, the -v option for
assigning variables before program execution was added to
accommodate applications that depended upon the old behavior.
(This feature was agreed upon by both the Bell Laboratories and
the GNU developers.)
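The difference between the two kinds of assignment shows up as
soon as the variable is used in a BEGIN block. In this sketch the
variable name n and its value are arbitrary, and a generic awk
command is used since -v is part of the POSIX standard.

```shell
# Assignment given as an operand: processed only when awk would open it
# as a file, which is after BEGIN has already run.
awk 'BEGIN { print "[" n "]" }' n=5       # prints: []

# Assignment given with -v: done before the program starts.
awk -v n=5 'BEGIN { print "[" n "]" }'    # prints: [5]
```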
The -W option for implementation specific features is from the
POSIX standard.

When processing arguments, gawk uses the special option "--" to
signal the end of arguments. In compatibility mode, it warns
about but otherwise ignores undefined options. In normal
operation, such arguments are passed on to the AWK program for
it to process.

The AWK book does not define the return value of srand(). The
POSIX standard has it return the seed it was using, to allow
keeping track of random number sequences. Therefore srand() in
gawk also returns its current seed.

Other new features are: The use of multiple -f options (from
MKS awk); the ENVIRON array; the \a and \v escape sequences
(done originally in gawk and fed back into the Bell
Laboratories version); the tolower() and toupper() built-in
functions (from the Bell Laboratories version); and the ANSI C
conversion specifications in printf (done first in the Bell
Laboratories version).
HISTORICAL FEATURES
There are two features of historical AWK implementations that
gawk supports. First, it is possible to call the length()
built-in function not only with no argument, but even without
parentheses! Thus,

     a = length     # Holy Algol 60, Batman!

is the same as either of

     a = length()
     a = length($0)

This feature is marked as "deprecated" in the POSIX standard,
and gawk issues a warning about its use if --lint is specified
on the command line.
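A quick check that the three spellings agree, as a sketch with a
made-up input line (a generic awk command is shown; gawk without
--lint should print the same thing silently):

```shell
echo hello | awk '{ a = length; b = length(); c = length($0); print a, b, c }'
# prints: 5 5 5
```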
The other feature is the use of either the continue or the
break statements outside the body of a while, for, or do loop.
Traditional AWK implementations have treated such usage as
equivalent to the next statement. Gawk supports this usage if
--traditional has been specified.
GNU EXTENSIONS
Gawk has a number of extensions to POSIX awk. They are
described in this section. All the extensions described
here can be disabled by invoking gawk with the
--traditional option.
The following features of gawk are not available in POSIX
awk.
No path search is performed for files named via the -f
option. Therefore the AWKPATH environment variable is
not special.
The \x escape sequence. (Disabled with --posix.)
The fflush() function. (Disabled with --posix.)
The ability to continue lines after ? and :. (Disabled
with --posix.)
Octal and hexadecimal constants in AWK programs.
The ARGIND, BINMODE, ERRNO, LINT, RT and TEXTDOMAIN
variables are not special.
The IGNORECASE variable and its side-effects are not
available.
The FIELDWIDTHS variable and fixed-width field
splitting.
The PROCINFO array is not available.
The use of RS as a regular expression.
The special file names available for I/O redirection are
not recognized.
The |& operator for creating co-processes.
The ability to split out individual characters using the
null string as the value of FS, and as the third
argument to split().
The optional second argument to the close() function.
The optional third argument to the match() function.
The ability to use positional specifiers with printf and
sprintf().
The use of delete array to delete the entire contents of
an array.
The use of nextfile to abandon processing of the current
input file.
The and(), asort(), bindtextdomain(), compl(),
dcgettext(), gensub(), lshift(), mktime(), or(), rshift(),
strftime(), strtonum(), systime() and xor() functions.
Localizable strings.
Adding new built-in functions dynamically with the
extension() function.
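Two of the extensions listed above, splitting a string into individual
characters with a null separator and clearing an array with delete
array, can be sketched together as follows. The string "awk" and the
names chars, left and result are assumptions made for illustration; the
sketch prefers gawk and falls back to whatever awk is on the PATH, which
may or may not offer the same extensions.

```shell
# Prefer gawk; fall back to any awk on the PATH (assumption of this sketch).
AWK=$(command -v gawk || command -v awk)

result=$("$AWK" 'BEGIN {
    n = split("awk", chars, "")   # null separator: one element per character
    print n, chars[1], chars[2], chars[3]
    delete chars                  # remove every element in one statement
    left = 0
    for (k in chars) left++       # count whatever survived the delete
    print left
}')
echo "$result"
```

With gawk the first line should show three elements, one per character,
and the second line should show that no elements remain after the
delete.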
The AWK book does not define the return value of the
close() function. Gawk's close() returns the value from
fclose(3) or pclose(3) when closing an output file or
pipe, respectively. It returns the process's exit status
when closing an input pipe. The return value is -1 if the
named file, pipe or co-process was not opened with a
redirection.
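The last case above, closing something that was never opened with a
redirection, is the easiest to observe. The sketch below is a minimal
illustration; the name "never-opened-stream" and the status variable are
hypothetical, and the sketch prefers gawk, falling back to whatever awk
is on the PATH.

```shell
# Prefer gawk; fall back to any awk on the PATH (assumption of this sketch).
AWK=$(command -v gawk || command -v awk)

# Nothing was ever opened under this name, so close() has nothing to close.
status=$("$AWK" 'BEGIN { print close("never-opened-stream") }')
echo "close() returned: $status"
```

With gawk the reported value should be -1, matching the description
above.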
When gawk is invoked with the --traditional option, if the
fs argument to the -F option is "t", then FS is set to the
tab character. Note that typing gawk -F\t ... simply
causes the shell to treat the backslash as quoting the
"t", so that "t" rather than "\t" is passed to the -F
option. Since this is a rather ugly special case, it is
not the default behavior. This behavior also does not
occur if --posix has been specified. To really get a tab
character as the field separator, it is best to use single
quotes: gawk -F'\t' ...
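A minimal sketch of the single-quote form recommended above follows. The
sample record and the variable name second are assumptions made for
illustration; the sketch prefers gawk and falls back to whatever awk is
on the PATH.

```shell
# Prefer gawk; fall back to any awk on the PATH (assumption of this sketch).
AWK=$(command -v gawk || command -v awk)

# The single quotes keep the backslash intact, so awk itself sees \t and
# interprets it as a tab. The space inside the second field is not a
# separator.
second=$(printf 'alpha\tbeta gamma\n' | "$AWK" -F'\t' '{ print $2 }')
echo "$second"
```

The second field should come out as "beta gamma", showing that only the
tab split the record.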
ENVIRONMENT VARIABLES
The AWKPATH environment variable can be used to provide a
list of directories that gawk searches when looking for
files named via the -f and --file options.
If POSIXLY_CORRECT exists in the environment, then gawk
behaves exactly as if --posix had been specified on the
command line. If --lint has been specified, gawk issues a
warning message to this effect.
SEE ALSO
egrep(1), getpid(2), getppid(2), getpgrp(2), getuid(2),
geteuid(2), getgid(2), getegid(2), getgroups(2)
The AWK Programming Language, Alfred V. Aho, Brian W.
Kernighan, Peter J. Weinberger, Addison-Wesley, 1988.
ISBN 0-201-07981-X.
GAWK: Effective AWK Programming, Edition 3.0, published by
the Free Software Foundation, 2001.
BUGS
The -F option is not necessary given the command line
variable assignment feature; it remains only for backwards
compatibility.
Syntactically invalid single character programs tend to
overflow the parse stack, generating a rather unhelpful
message. Such programs are surprisingly difficult to
diagnose in the completely general case, and the effort to
do so really is not worth it.
AUTHORS
The original version of UNIX awk was designed and
implemented by Alfred Aho, Peter Weinberger, and Brian
Kernighan of Bell Laboratories. Brian Kernighan continues
to maintain and enhance it.
Paul Rubin and Jay Fenlason, of the Free Software
Foundation, wrote gawk, to be compatible with the original
version of awk distributed in Seventh Edition UNIX. John
Woods contributed a number of bug fixes. David Trueman,
with contributions from Arnold Robbins, made gawk
compatible with the new version of UNIX awk. Arnold Robbins
is the current maintainer.
The initial DOS port was done by Conrad Kwok and Scott
Garfinkle. Scott Deifik is the current DOS maintainer.
Pat Rankin did the port to VMS, and Michal Jaegermann did
the port to the Atari ST. The port to OS/2 was done by
Kai Uwe Rommel, with contributions and help from Darrel
Hankerson. Fred Fish supplied support for the Amiga,
Stephen Davies provided the Tandem port, and Martin Brown
provided the BeOS port.
VERSION INFORMATION
This man page documents gawk, version 3.1.0.
BUG REPORTS
If you find a bug in gawk, please send electronic mail to
bug-gawk@gnu.org. Please include your operating system
and its revision, the version of gawk (from gawk
--version), what C compiler you used to compile it, and a
test program and data that are as small as possible for
reproducing the problem.
Before sending a bug report, please do two things. First,
verify that you have the latest version of gawk. Many
bugs (usually subtle ones) are fixed at each release, and
if yours is out of date, the problem may already have been
solved. Second, please read this man page and the
reference manual carefully to be sure that what you think
is a bug really is, instead of just a quirk in the
language.
Whatever you do, do NOT post a bug report in
comp.lang.awk. While the gawk developers occasionally
read this newsgroup, posting bug reports there is an
unreliable way to report bugs. Instead, please use the
electronic mail address given above.
ACKNOWLEDGEMENTS
Brian Kernighan of Bell Laboratories provided valuable
assistance during testing and debugging. We thank him.
COPYING PERMISSIONS
Copyright 1989, 1991, 1992, 1993, 1994, 1995, 1996,
1997, 1998, 1999, 2001, 2002 Free Software Foundation,
Inc.
Permission is granted to make and distribute verbatim
copies of this manual page provided the copyright notice
and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified
versions of this manual page under the conditions for
verbatim copying, provided that the entire resulting derived
work is distributed under the terms of a permission notice
identical to this one.
Permission is granted to copy and distribute translations
of this manual page into another language, under the above
conditions for modified versions, except that this
permission notice may be stated in a translation approved by
the Foundation.
Free Software Foundation Apr 16 2002 GAWK(1)